Skip to content

Constrain server data file paths#1423

Open
fallintoplace wants to merge 2 commits into
NVIDIA:mainfrom
fallintoplace:fix/server-data-file-path-boundary
Open

Constrain server data file paths#1423
fallintoplace wants to merge 2 commits into
NVIDIA:mainfrom
fallintoplace:fix/server-data-file-path-boundary

Conversation

@fallintoplace

Copy link
Copy Markdown

Summary

The local-file request path for cuopt-data-file now enforces the documented CUOPT_DATA_DIR boundary.

Previously the server joined CUOPT_DATA_DIR with the header value and only checked existence. Absolute paths and .. segments could resolve outside the configured data directory, and symlinks could do the same.

This change rejects absolute input paths, resolves both the configured root and requested file with realpath, verifies the result stays under CUOPT_DATA_DIR, and requires the target to be a regular file.

Validation

  • python3 -m py_compile python/cuopt_server/cuopt_server/webserver.py python/cuopt_server/cuopt_server/tests/test_file_path_validation.py
  • git diff --check
  • uvx --with ruff ruff check python/cuopt_server/cuopt_server/webserver.py python/cuopt_server/cuopt_server/tests/test_file_path_validation.py
  • PYTHONPATH=python/cuopt_server uvx --with 'pytest<9.0' --with fastapi --with jsonref==1.1.0 --with msgpack==1.1.2 --with msgpack-numpy==0.4.8 --with psutil --with uvicorn --with numpy --with pandas pytest python/cuopt_server/cuopt_server/tests/test_file_path_validation.py -q

The pytest run passed with existing Pydantic deprecation warnings during server import.

@fallintoplace fallintoplace requested a review from a team as a code owner June 10, 2026 21:42
@copy-pr-bot

copy-pr-bot Bot commented Jun 10, 2026

Copy link
Copy Markdown

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 0c508de8-2521-4402-98f9-b9b0dbe4d425

📥 Commits

Reviewing files that changed from the base of the PR and between 0c5676c and 56f787d.

📒 Files selected for processing (2)
  • python/cuopt_server/cuopt_server/tests/test_file_path_validation.py
  • python/cuopt_server/cuopt_server/webserver.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • python/cuopt_server/cuopt_server/webserver.py

📝 Walkthrough

Walkthrough

Rewrites validate_file_path to enforce CUOPT data directory containment (reject absolute paths, traversal, and symlink escapes) and adds tests exercising valid relative paths, unset data dir, absolute paths, parent-directory traversal, non-regular targets, and symlink escapes.

Changes

File Path Validation Security

Layer / File(s) Summary
Path validation implementation
python/cuopt_server/cuopt_server/webserver.py
validate_file_path now requires settings.get_data_dir() to be set, rejects absolute cuopt_data_file inputs, resolves real paths and enforces containment within the configured data directory via os.path.commonpath, checks the resolved path exists as a regular file, and raises HTTP 400 with specific error messages for each failure mode.
Path validation test suite
python/cuopt_server/cuopt_server/tests/test_file_path_validation.py
Adds an autouse fixture that snapshots and restores settings data dir and seven tests: valid relative path inside data dir; rejection when data dir unset/empty; rejection of absolute paths outside data dir; rejection of parent-directory traversal (../...); rejection when target is not a regular file (directory); and rejection of symlink-based escapes. All rejections assert HTTP 400 and expected error detail text.

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title 'Constrain server data file paths' directly describes the main objective of the PR: enforcing boundaries for data file path handling in the server.
Description check ✅ Passed The description is clearly related to the changeset, explaining the security issue, the implementation approach, and providing validation steps performed.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
python/cuopt_server/cuopt_server/webserver.py (1)

1047-1052: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

The boundary check is still bypassable via TOCTOU.

Line 1051 validates the pathname, but Lines 1142-1154 enqueue only the string for later use in another worker. If CUOPT_DATA_DIR is writable by an untrusted process, the validated file can be replaced with a different file or symlink after the check and before the solver opens it, which reintroduces the escape this PR is trying to close. Open the file here and pass an fd/bytes, or re-resolve and validate again at the actual open site.

Also applies to: 1142-1154

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python/cuopt_server/cuopt_server/webserver.py` around lines 1047 - 1052, The
code validates the pathname via validate_file_path(cuopt_data_file) and then
later enqueues only the string (file_path), leaving a TOCTOU window if
CUOPT_DATA_DIR is writable; fix by opening the file immediately after
validate_file_path (e.g., obtain a file descriptor or read bytes) and pass that
safe handle/data into the queue instead of the filename, or if you must enqueue
a path, re-resolve and re-run validate_file_path at the actual open site (the
worker that processes the queue, i.e., where you currently handle lines
~1142-1154) and ensure you use openat with the directory fd or verify inode/dev
after open to prevent symlink/race attacks.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@python/cuopt_server/cuopt_server/tests/test_file_path_validation.py`:
- Around line 22-74: Add tests in test_file_path_validation.py to cover the two
missing branches in webserver.validate_file_path: (1) the CUOPT_DATA_DIR-unset
path by clearing/unsetting settings.set_data_dir (or not calling it) and calling
validate_file_path to assert it raises the expected HTTPException/status and
message about CUOPT_DATA_DIR being unset; and (2) the "must be a regular file"
path by creating a non-regular target inside the data dir (e.g., a subdirectory
or FIFO) and calling validate_file_path on that name to assert it raises
HTTPException with the "regular file" (or similar) detail. Use the existing test
helpers and exception assertions (pytest.raises and exc_info.value.detail) and
reference validate_file_path and settings.set_data_dir so the new tests exercise
those branches.

In `@python/cuopt_server/cuopt_server/webserver.py`:
- Around line 216-220: The log call leaks the resolved internal filesystem path
via file_path; update the error logging in webserver.py so it does not include
file_path (use cuopt_data_file or a generic message instead) and ensure any
HTTPException detail or other logs never expose the resolved server path; locate
the check using file_path and cuopt_data_file and replace the
logging.error(f"File path '{file_path}'...") with a safe message that references
only cuopt_data_file or a non-path generic string.

---

Outside diff comments:
In `@python/cuopt_server/cuopt_server/webserver.py`:
- Around line 1047-1052: The code validates the pathname via
validate_file_path(cuopt_data_file) and then later enqueues only the string
(file_path), leaving a TOCTOU window if CUOPT_DATA_DIR is writable; fix by
opening the file immediately after validate_file_path (e.g., obtain a file
descriptor or read bytes) and pass that safe handle/data into the queue instead
of the filename, or if you must enqueue a path, re-resolve and re-run
validate_file_path at the actual open site (the worker that processes the queue,
i.e., where you currently handle lines ~1142-1154) and ensure you use openat
with the directory fd or verify inode/dev after open to prevent symlink/race
attacks.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: cadfe213-c21f-4a01-af0b-1bfba8893d48

📥 Commits

Reviewing files that changed from the base of the PR and between c99a5ce and 0c5676c.

📒 Files selected for processing (2)
  • python/cuopt_server/cuopt_server/tests/test_file_path_validation.py
  • python/cuopt_server/cuopt_server/webserver.py

Comment thread python/cuopt_server/cuopt_server/tests/test_file_path_validation.py
Comment thread python/cuopt_server/cuopt_server/webserver.py Outdated
@fallintoplace fallintoplace force-pushed the fix/server-data-file-path-boundary branch from 0c5676c to 56f787d Compare June 11, 2026 16:37
@rgsl888prabhu rgsl888prabhu added non-breaking Introduces a non-breaking change improvement Improves an existing functionality labels Jun 11, 2026
@rgsl888prabhu

Copy link
Copy Markdown
Collaborator

/ok to test 56f787d

1 similar comment
@rgsl888prabhu

Copy link
Copy Markdown
Collaborator

/ok to test 56f787d

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants